Realization of Extensions to Faddeev Algorithm on Array of SIMD Processors

Authors

Hai Van Dinh

Marek A. Perkowski

Abstract

The paper presents three types of extensions (vertical, horizontal, and two-dimensional) to the new Faddeev-algorithm-based systolic architecture for matrix computations presented in: H.V.D. Le, M.A. Perkowski, "A New General Purpose Systolic Architecture for Matrix Computations", Proc. Intern. Conf. on Computing and Information, ICCI'89, Ontario, Canada, 1989. The architecture has essential advantages over previous architectures of this type and finds various applications: extensions to the Faddeev algorithm can be used in many problems, including the Karmarkar algorithm. The extensions described in this paper not only increase system throughput two- to fourfold but also enhance the inherent programmability of Faddeev's algorithm. This allows the architecture to perform very complex matrix calculations.
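For readers unfamiliar with the Faddeev algorithm, the following is a minimal sequential sketch of its textbook form, not the systolic or SIMD realization and not the extensions the abstract describes. It builds the compound matrix [[A, B], [-C, D]] and uses row operations to annihilate -C, which leaves D + C·A⁻¹·B in the lower-right block. The function name `faddeev` and the use of exact `Fraction` arithmetic are assumptions made for illustration only.

```python
from fractions import Fraction


def faddeev(A, B, C, D):
    """Return D + C * A^{-1} * B by annihilating -C in [[A, B], [-C, D]].

    A is n x n, B is n x m, C is p x n, D is p x m (lists of lists).
    Hypothetical helper for illustration; entries are converted to Fraction.
    """
    n = len(A)
    # Top block rows are [A | B]; bottom block rows are [-C | D].
    top = [[Fraction(x) for x in ra + rb] for ra, rb in zip(A, B)]
    bottom = [[-Fraction(x) for x in rc] + [Fraction(x) for x in rd]
              for rc, rd in zip(C, D)]
    M = top + bottom

    for k in range(n):
        # Pivots are taken only from the top block (rows k..n-1), so the
        # bottom block is changed solely by adding multiples of top rows.
        pivot = next((r for r in range(k, n) if M[r][k] != 0), None)
        if pivot is None:
            raise ValueError("matrix A is singular")
        M[k], M[pivot] = M[pivot], M[k]
        # Eliminate column k in every row below the pivot, top and bottom.
        for r in range(k + 1, len(M)):
            factor = M[r][k] / M[k][k]
            if factor != 0:
                M[r] = [x - factor * y for x, y in zip(M[r], M[k])]

    # The lower-left block is now zero; the lower-right block is the result.
    return [row[n:] for row in M[n:]]


if __name__ == "__main__":
    A = [[2, 1], [1, 3]]
    identity = [[1, 0], [0, 1]]
    zeros = [[0, 0], [0, 0]]
    # With B = I, C = I, D = 0 the result is the inverse of A.
    print(faddeev(A, identity, identity, zeros))
```

The choice of B, C, and D selects the computation: B = I, C = I, D = 0 yields A⁻¹, while B = b, C = I, D = 0 yields the solution of Ax = b. This is the sense in which the algorithm is usually called programmable.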

Similar resources

Feasibility Analysis of Ultra High Frame Rate Visual Servoing on FPGA and SIMD Processor

Visual servoing has been proven to obtain better performance than mechanical encoders for position acquisition. However, the often computationally intensive vision algorithms and the ever growing demands for higher frame rate make its realization very challenging. This work performs a case study on a typical industrial application, organic light emitting diode (OLED) screen printing, and demons...


1000 fps Visual Servoing on the Reconfigurable Wide SIMD Processor

Visual servoing has been proven to obtain better performance than encoders at comparable cost. However, the often computationally intensive vision algorithms and the ever growing demands for higher frame rate make its realization very challenging. This paper demonstrated the feasibility of achieving high frame-rate visual servoing applications on the wide Single-Instruction-Multiple-Data (SIMD) ...


Applying SIMD Approach to Whole Genome Comparison on Commodity Hardware

Whole genome comparison compares (aligns) two genome sequences assuming that analogous characteristics may be found. In this paper, we present an SIMD version of the Smith-Waterman algorithm utilizing Streaming SIMD Extensions (SSE), running on Intel Pentium processors. We compare two approaches, one requiring explicit data dependency handling and one built to automatically handle dependencies ...


Faster Incoherent Ray Traversal Using 8-Wide AVX Instructions

Efficiently tracing randomly distributed rays is a highly challenging problem on wide-SIMD processors. The MBVH (multi bounding volume hierarchy) is an acceleration structure specifically designed for incoherent ray tracing on processors with explicit SIMD architectures like the CPU. Existing MBVH traversal methods for CPUs target 4-wide SIMD architectures using the SSE instruction set. Recentl...


An Implementation of Parallel 1-D FFT Using SSE3 Instructions on Dual-Core Processors

In the present paper, an implementation of a parallel one-dimensional fast Fourier transform (FFT) using Streaming SIMD Extensions 3 (SSE3) instructions on dual-core processors is proposed. Combination of vectorization and the block six-step FFT algorithm is shown to effectively improve performance. The performance results for one-dimensional FFTs on dual-core Intel Xeon processors are reported...


Publication date: 2004